ABSTRACT

This paper presents a new derivation of the Generalized Poisson distribution. The 
derivation is based on the barrier crossing statistics of random walks associated with 
the Poisson distribution. A simple interpretation of this model in terms of a single 
server queue is also included. 

In the astrophysical context, the Generalized Poisson distribution is interesting 
because it provides a good fit to the evolved, Eulerian counts-in-cells distribution 
measured in numerical simulations of hierarchical clustering from Poisson initial con- 
ditions. The new derivation presented here can be used to construct a useful analytic 
model of the evolution of clustering measured in these simulations. The model is con- 
sistent with the assumption that, as the universe expands and the comoving sizes of 
regions change as a result of gravitational instability, the number of such expanding 
and contracting regions is conserved. The model neglects the influence of external 
tides on the evolution of such regions. Indeed, in the context of this model, the Gen- 
eralized Poisson distribution can be thought of as arising from a simple variant of the 
well-studied spherical collapse model, in which tidal effects are also neglected. This has 
the following implication: Insofar as the Generalized Poisson distribution derived from 
this model is a reasonable fit to the numerical simulation results, the counts-in-cells 
statistic must be relatively insensitive to such effects. This may be a consequence of 
the Poisson initial condition. 

The model can be understood as a simple generalization of the excursion set model 
which has recently been used to estimate the number density of collapsed, virialized 
halos. The generalization developed here allows one to estimate the evolution of the 
spatial distribution of these halos, as well as their number density. For example, it 
provides a framework within which the halo-halo correlation functions, at any epoch, 
can be computed analytically. In the model, when halos first virialize, they are uncorrelated
with each other. This is in good agreement with the simulations. Since it allows
one to describe the spatial distribution of the halos and the mass simultaneously, the 
model allows one to estimate the extent to which these halos are biased tracers of the 
underlying matter distribution. 

Key words: galaxies: clustering - cosmology: theory - dark matter. 



1 INTRODUCTION 

Consider an initially Poisson distribution of particles that 
clusters gravitationally as the universe expands. In this pa- 
per, the initial Poisson distribution will also be called the 
initial Lagrangian distribution. As time passes, the particle 
distribution evolves, as, for example, tightly bound virialized 
clusters (called halos, or dark matter halos, in this paper) 
form. Thus, the evolved distribution is different from the 
initial Lagrangian distribution. In what follows, the evolved 
distribution will be called the Eulerian distribution. The 



goal of this paper is to use the properties of the initial La- 
grangian distribution to derive a reasonable approximation 
to the form of the evolved Eulerian distribution. In the ab- 
sence of a model relating the two distributions, the only 
constraint is that required by mass conservation: the num- 
ber of particles in the initial and evolved distributions is 
the same, so the average density, $\bar n$, in the two distributions
must be the same. In what follows, quantities measured in
the Lagrangian space will have a subscript '0', while those in
Eulerian space will not. In this notation, mass conservation
implies that $\bar n_0 = \bar n$.

Studies of clustering from Poisson initial conditions 
(Itoh, Inagaki & Saslaw 1993 and references therein) show 
that when the initial, Lagrangian distribution is Poisson, 
then the evolved Eulerian distribution is Generalized Pois- 
son. This paper presents a model in which this is so. The 
model is consistent with three general hypotheses about the 
evolution of clustering. The first is the hypothesis that, in 
comoving coordinates, initially denser regions contract more 
rapidly than less dense regions, and that sufficiently un- 
derdense regions expand. The second assumption is that, 
as the universe evolves, the number of such expanding and 
contracting regions is conserved — only their comoving size 
changes. The third is that the influence of external tides on 
the evolution of such comoving regions can be neglected, 
if one is only interested in computing statistics such as 
the mass function of collapsed halos, or the distribution of 
counts in Eulerian cells. There are no compelling physical 
arguments for any of these assumptions, and initial particle 
configurations which violate some or all of these assumptions 
are relatively easy to construct. That the model predicts a 
counts-in-cells distribution which is a reasonable approximation
to that measured in the numerical simulations suggests
that, at least for clustering from Poisson initial conditions, 
these simple assumptions may also be reasonably accurate. 



1.1 The Generalized Poisson distribution 

Since it plays a central role in this paper, various known 
properties of the Generalized Poisson distribution are sum- 
marized below. 

The Generalized Poisson distribution (Consul 1989) has 
the form 



$$p(N|V,b) = \frac{\bar N(1-b)}{N!}\,\left[\bar N(1-b) + Nb\right]^{N-1} e^{-\bar N(1-b) - Nb}. \qquad (1)$$



Here $p(N|V,b)$ is the probability that a cell of size V placed
randomly within a particle distribution contains exactly N
particles. If $\bar n$ denotes the average density, then $\bar N = \bar n V$.
In this paper $0 \le b < 1$, and, for reasons discussed below,
it will be supposed that b is not a function of V. The case
$b = 0$ is the Poisson distribution.
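As a quick numerical sanity check on equation (1), the following Python sketch evaluates the distribution and sums it directly. The function name and the parameter values are illustrative assumptions, not taken from the text; the sum is truncated at a large but finite N.

```python
import math

def generalized_poisson_pmf(N, Nbar, b):
    """Equation (1): p(N | V, b), with Nbar = nbar * V and 0 <= b < 1."""
    theta = Nbar * (1.0 - b)
    # Work in log space so that large N does not overflow.
    log_p = (math.log(theta) + (N - 1) * math.log(theta + N * b)
             - theta - N * b - math.lgamma(N + 1))
    return math.exp(log_p)

Nbar, b = 3.0, 0.4                      # illustrative values
probs = [generalized_poisson_pmf(N, Nbar, b) for N in range(400)]
total = sum(probs)
mean = sum(N * p for N, p in enumerate(probs))
var = sum((N - mean) ** 2 * p for N, p in enumerate(probs))

print("normalization:", total)
print("mean:", mean, "expected Nbar =", Nbar)
print("variance:", var, "expected Nbar/(1-b)^2 =", Nbar / (1.0 - b) ** 2)
```

Setting `b = 0.0` in the same function reduces it to the ordinary Poisson probability, which is an easy way to confirm the limiting case quoted above.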

Equation (1) is a Compound Poisson distribution (e.g.
Saslaw 1989); it arises if point sized clusters, called halos
in the following, have a Poisson spatial distribution, and
the probability a randomly chosen halo contains exactly n
particles is

$$\eta(n,b) = \frac{(nb)^{n-1} e^{-nb}}{n!}. \qquad (2)$$

This is the Borel(b) distribution (Borel 1942). In this paper,
equation (2) will be called the halo mass function.
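The Compound Poisson statement can be checked numerically. The sketch below builds the counts distribution for a Poisson number of halos with Borel(b) masses using the standard Panjer recursion for compound Poisson sums, and compares it with equation (1). The mean number of halos per cell is taken to be $\bar N(1-b)$, which follows from the Borel mean of $1/(1-b)$ particles per halo; all function names and parameter values are illustrative assumptions.

```python
import math

def borel_pmf(n, b):
    """Equation (2): Borel(b) halo mass function, n >= 1, 0 < b < 1."""
    return math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))

def generalized_poisson_pmf(N, Nbar, b):
    """Equation (1)."""
    theta = Nbar * (1.0 - b)
    return math.exp(math.log(theta) + (N - 1) * math.log(theta + N * b)
                    - theta - N * b - math.lgamma(N + 1))

def compound_poisson_pmf(n_max, mean_halos, cluster_pmf):
    """Panjer recursion: g[0] = exp(-lam), g[n] = (lam/n) sum_k k f[k] g[n-k]."""
    f = [0.0] + [cluster_pmf(k) for k in range(1, n_max + 1)]
    g = [math.exp(-mean_halos)]
    for n in range(1, n_max + 1):
        s = sum(k * f[k] * g[n - k] for k in range(1, n + 1))
        g.append(mean_halos * s / n)
    return g

Nbar, b = 3.0, 0.4                       # illustrative values
mean_halos = Nbar * (1.0 - b)            # mean number of halos per cell
g = compound_poisson_pmf(40, mean_halos, lambda k: borel_pmf(k, b))
max_diff = max(abs(g[N] - generalized_poisson_pmf(N, Nbar, b))
               for N in range(41))
print("largest difference:", max_diff)
```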

The Generalized Poisson distribution was first discov- 
ered in the astrophysical context by Saslaw & Hamilton 
(1984) (also see Sheth 1995a). It provides a good fit to the 
distribution of particle counts in randomly placed cells, pro- 
vided the particle distributions have evolved, as a result of 
gravitational clustering, from an initially Poisson distribu- 
tion (Itoh, Inagaki & Saslaw 1993 and references therein). 
In fact, the fits are significantly improved if b is allowed to 
increase to an asymptotic value as V increases. This scale 



dependence is simply a consequence of relaxing the assump- 
tion that Borel clusters are point sized, but still requiring 
that they have some finite size. The asymptotic value of 
b is that which would have characterized the distribution, 
had the clusters been point sized (Sheth & Saslaw 1994). 
For this reason, the asymptotic value of b is fundamental, 
and the point sized idealization useful. This paper is mainly 
concerned with the point sized idealization, so that, in what 
follows, b is independent of V. 

The point sized idealization is also motivated by the 
following observation. To a good approximation, the distri- 
bution of bound virialized halos in the numerical simulations 
is Borel(b). Thus, to a good approximation, clustering from 
Poisson initial conditions evolves in such a way that, at all 
times, particles are bound up in Borel(b) halos, and, at the
time when they first virialize, these halos have a Poisson 
distribution. The evolution of clustering is parameterized by 
the time dependence of b; it is zero initially, and it increases 
as the universe expands (Zhan 1989; Sheth 1995b and refer- 
ences therein). Therefore, in the remainder of this paper, b 
will be treated as a pseudo-time variable, and the Borel(b)
distribution will often be called the halo mass function at 
the epoch b. 

As $V \to 0$, most cells in the Lagrangian and Eulerian
distributions will be empty. Equation (1) shows that, in this
limit, the probability that a cell is not empty is $\bar N(1-b)$,
and the probability that a non-empty cell contains exactly
N particles is given by the Borel(b) distribution. In other
words, at the epoch b, the halo mass function is the same as
the vanishing-cell-size limit of the Eulerian counts-in-cells
distribution (Sheth 1996a). This fact will be useful later.
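The vanishing-cell-size limit can be illustrated numerically. The sketch below takes a very small $\bar N$ (an arbitrary illustrative value) and checks that the non-empty probability approaches $\bar N(1-b)$ and that the counts distribution, conditioned on the cell being non-empty, approaches the Borel(b) distribution. The function names are assumptions made for this example.

```python
import math

def borel_pmf(n, b):
    """Equation (2)."""
    return math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))

def generalized_poisson_pmf(N, Nbar, b):
    """Equation (1)."""
    theta = Nbar * (1.0 - b)
    return math.exp(math.log(theta) + (N - 1) * math.log(theta + N * b)
                    - theta - N * b - math.lgamma(N + 1))

b = 0.4
Nbar = 1e-6                                   # a very small cell
# p(N > 0) = 1 - exp(-Nbar (1 - b)), computed without cancellation error.
p_nonempty = -math.expm1(-Nbar * (1.0 - b))
ratios = [generalized_poisson_pmf(N, Nbar, b) / p_nonempty / borel_pmf(N, b)
          for N in range(1, 8)]

print("p(non-empty) / [Nbar (1 - b)] =", p_nonempty / (Nbar * (1.0 - b)))
print("conditional counts / Borel:", ratios)
```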

The Borel(b) distribution can be derived from a num- 
ber of different constructions, all of which are related to 
the Poisson distribution (Epstein 1983; Sheth 1995b; Sheth 
1996b; Sheth & Pitman 1997). In the context of this pa- 
per, all these constructions can be thought of as providing 
models that allow one to compute the Eulerian space dis- 
tribution, in the limit of vanishing cell size, given that the 
Lagrangian space distribution is Poisson. One of these con- 
structions, based on the statistics of random walk barrier 
crossings associated with the Poisson distribution, is the ex- 
cursion set model (Epstein 1983; Sheth 1995b). This paper 
shows how to derive the Generalized Poisson distribution 
from a simple generalization of this excursion set model. 
The generalization shows how to derive the Eulerian space 
Generalized Poisson distribution from the Lagrangian space 
Poisson distribution, for all cell sizes, and all times. 



1.2 Outline of this paper 

Section 2.1 summarizes the random walk, excursion set
model which leads to the Borel(b) distribution. Sections 2.2
and 2.3 describe a generalization of this model which leads
to a new derivation of the Generalized Poisson distribution.
Section 2.4 shows how to describe the spatial distribution of
virialized halos within the context of this model. It shows
that the model is consistent with the Compound Poisson
interpretation of the Generalized Poisson distribution: in
the model, Borel(b) halos have a Poisson distribution at the
time when they first virialize. Moreover, in the model, the
$V \to 0$ limit of the counts-in-cells distribution is, indeed,
the halo mass function. This shows explicitly that the excursion
set approach developed here is able to reproduce the
known properties of the Generalized Poisson distribution.
The relation between this model and the well-studied spherical
collapse model (outlined in Appendix A) is discussed in
Section 2.5.

Section 3 contains a brief digression which relates the
excursion set model of the previous section to a simple single
server queue system. Section 4 discusses a scaling solution
associated with the model that is analogous to the scaling
solution found in Section 3.2 of Sheth (1995b). Section 5
discusses the associated two barrier problem. The solution of
this problem may provide useful diagnostics in assessing the
rate of evolution of the Eulerian statistics computed earlier
in the paper.

Clustering from more general initial conditions, using 
the techniques developed here, is treated in a forthcoming 
paper. 



2 POISSON INITIAL CONDITIONS AND THE 
GENERALIZED POISSON DISTRIBUTION 

This section presents a new derivation of the Generalized 
Poisson distribution. The derivation uses a simple general- 
ization of the excursion set model studied by Epstein (1983) 
and Sheth (1995b). 



2.1 The excursion set with constant barrier 

Suppose that the initial Lagrangian distribution is Poisson,
with mean density $\bar n$. This means that a volume of size $V_0$
placed at a random position in this distribution will contain
exactly N particles with probability

$$p(N|V_0) = \frac{\bar N_0^{\,N} e^{-\bar N_0}}{N!}, \quad {\rm where}\ \bar N_0 = \bar n V_0. \qquad (3)$$



Now choose a random particle of this distribution, and compute
the density within concentric spheres centred on this
position. Call the curve $N(V_0)$ traced out by the number
of particles contained within a sphere $V_0$ centred on this
particle, as a function of the sphere size $V_0$, a trajectory.
Then each particle in the Poisson distribution has its associated
Lagrangian trajectory. Given $\delta_{c0}$, Epstein (1983)
derived an expression for the fraction of Lagrangian trajectories
for which $N(V_0) = \bar n V_0(1+\delta_{c0})$, and for which
$N(V_0') < \bar n V_0'(1+\delta_{c0})$ for all $V_0' > V_0$ (also see Sheth 1995b;
Sheth & Lemson 1998).

Epstein argued that a given value of $\delta_{c0} > 0$ defines a
series of volumes $V_0(1) < V_0(2) < \ldots$ for which

$$\frac{j}{V_0(j)} = \bar n\,(1+\delta_{c0}) = \frac{\bar n}{b}, \quad {\rm where}\ 0 < b < 1. \qquad (4)$$

The final equality defines $b = 1/(1+\delta_{c0})$. The evolution of
clustering enters through the time dependence of $\delta_{c0}$, which
decreases as time increases. It is in this sense that b is a
pseudo-time variable; it is zero initially, and increases as the
universe expands (Sheth 1995b).

Epstein showed that the fraction of trajectories $f_c(j,b_1)$
for which $V_0(j)$ is the largest value of $V_0$ at which
$N(V_0) = \bar N_0/b_1$ is

$$f_c(j,b_1) = (1-b_1)\,\frac{(jb_1)^{j-1} e^{-jb_1}}{(j-1)!} = j\,(1-b_1)\,\eta(j,b_1), \qquad (5)$$

where $\eta(j,b)$ is the Borel(b) distribution defined earlier. The
mean of the Borel(b) distribution is $\sum_j j\,\eta(j,b) = (1-b)^{-1}$,
a fact which will be useful later.

Since $f_c(j,b_1)$ is a statement about the last crossing
of the barrier (the dashed line that intersects the origin in
Fig. 1) by the random walk, excursion set trajectories, it will
sometimes be referred to as the barrier crossing distribution.
The subscript 'c' here denotes the fact that this is the distribution
of last crossings of a constant boundary; that is,
$\delta_{c0}$ is independent of $V_0$.

Let $f_c(j,b_1|N,b_2)$ denote the fraction of trajectories for
which $V_0(j)$ is the largest value of $V_0$ at which
$N(V_0) \ge \bar n V_0/b_1$, given that, at some $V_0' = V_0(N) \ge V_0(j)$ they have
value $N(V_0') = \bar n V_0'/b_2$, with $b_2 \ge b_1$, and that all $V_0 > V_0(N)$
are less dense than $V_0(N)$. Then

$$f_c(j,b_1|N,b_2) = N\left(1-\frac{b_1}{b_2}\right){N \choose j}\,\frac{j^j}{N^N}\left(\frac{b_1}{b_2}\right)^{j-1}\left(N - \frac{jb_1}{b_2}\right)^{N-j-1} \qquad (6)$$

(Sheth 1995b).
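Since every trajectory that last crosses the $b_2$ barrier at N must last cross the higher $b_1$ barrier at some $j \le N$, equation (6) should sum to unity over j. The following sketch checks this numerically, assuming the form of equation (6) written above; the function name and parameter values are illustrative assumptions.

```python
import math

def f_c_conditional(j, b1, N, b2):
    """Equation (6): f_c(j, b1 | N, b2) for 1 <= j <= N and 0 < b1 <= b2."""
    r = b1 / b2
    if j == N:
        # N (1 - r) cancels against (N - N r)^(-1), leaving r^(N-1).
        return r ** (N - 1)
    if r >= 1.0:
        return 0.0
    log_f = (math.log(N) + math.log1p(-r)
             + math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
             + j * math.log(j) - N * math.log(N)
             + (j - 1) * math.log(r)
             + (N - j - 1) * math.log(N - j * r))
    return math.exp(log_f)

b1, b2 = 0.3, 0.8                         # illustrative values
sums = {N: sum(f_c_conditional(j, b1, N, b2) for j in range(1, N + 1))
        for N in (1, 2, 5, 20, 100)}
print(sums)
```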

It is usual to associate these expressions about the
statistics of trajectories crossing a constant barrier with
statements about the number density of collapsed (point-sized)
halos. Thus, the average number density of $b_1$-halos
that contain exactly j particles is

$$n_0(j,b_1) = \bar n_0\,\frac{f_c(j,b_1)}{j} = \bar n\,(1-b_1)\,\eta(j,b_1). \qquad (7)$$



This assignment comes from assuming that the fraction of
trajectories $f_c(j,b_1)$ can be equated to the fraction of the
Lagrangian space associated with regions containing j particles
with overdensity $b_1$. The final equality comes from
equation (5) and using the fact that $\bar n_0 = \bar n$.

Similarly, the average number of $(j,b_1)$-halos that are
within an $(N,b_2)$-halo is

$$\mathcal{N}(j,b_1|N,b_2) = \left(\frac{N}{j}\right) f_c(j,b_1|N,b_2), \qquad (8)$$

where $N \ge j$ and $b_2 \ge b_1$.

If the trajectories are not centred on particles, then the
barrier crossing distribution is

$$F_c(j,b_1) = b_1\, f_c(j,b_1), \qquad (9)$$

and

$$F_c(j,b_1|N,b_2) = (b_1/b_2)\, f_c(j,b_1|N,b_2). \qquad (10)$$



However, the number density of associated regions is the 
same as before (Sheth & Lemson 1998). 

Consider an $(N,b_2)$-halo that is known to contain m
$b_1$-subhalos, of which $n_1$ are singles, $n_2$ are doubles and so
on. Thus, $\sum_j n_j = m$, and mass conservation requires
that $\sum_j j\,n_j = N$. Of course, $b_1 < b_2$. Let $\pi[{\bf n}|N]$ denote
one particular partition of N, where ${\bf n}$ denotes the vector
$(n_1, \ldots, n_N)$, and let $p({\bf n}; b_1|N; b_2)$ denote the probability
that the partition $\pi[{\bf n}|N]$ occurred. Sheth (1996b) shows that

$$p({\bf n}; b_1|N; b_2) = \frac{(Nb_{21})^{m-1} e^{-Nb_{21}}}{\eta(N,b_2)} \prod_{j=1}^{N} \frac{\eta(j,b_1)^{n_j}}{n_j!}, \qquad (11)$$
Figure 1. Example of the trajectory (solid line) traced by the
number, N, of Poisson-distributed objects within a sphere which
contains $\bar N_0$ objects on average. The dotted line, which has unit
slope, shows the curve traced out by the average density. The
dashed line which starts at the origin shows the barrier considered
by Epstein (1983) and Sheth (1995b). The dashed line which
starts at $\bar N_0 = \beta$ is the shifted barrier studied here.



where $b_{21} = (b_2 - b_1)$, and $Nb_2 = \bar n V_0(N)$, is consistent with
the excursion set model described above (also see Sheth &
Pitman 1997, Sheth & Lemson 1998).

For example, the average number of $(j,b_1)$-halos that
are within $(N,b_2)$-halos is

$$\langle n_j; b_1|N; b_2\rangle = \sum_{\pi[{\bf n}|N]} n_j\, p({\bf n}; b_1|N; b_2) = \mathcal{N}(j,b_1|N,b_2), \qquad (12)$$

and the sum is over the set of all distinct ordered partitions
of N. The final expression is the same as equation (8) as
required; the algebra leading to it is given in Appendix B of
Sheth (1996b).
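For small N the partition average can be done by brute force. The sketch below enumerates the integer partitions of N, evaluates the partition probability in the form $(Nb_{21})^{m-1}e^{-Nb_{21}}\,\eta(N,b_2)^{-1}\prod_j \eta(j,b_1)^{n_j}/n_j!$, and checks that it sums to unity and that the mean of $n_j$ equals $(N/j)\,f_c(j,b_1|N,b_2)$. The function names and parameter values are illustrative assumptions, and the forms used for the partition probability and for $f_c$ are the ones assumed in this example.

```python
import math
from collections import Counter

def borel_pmf(n, b):
    """Borel(b) distribution: (n b)^(n-1) exp(-n b) / n!."""
    return math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))

def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing lists of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest

def partition_probability(parts, b1, N, b2):
    """Probability of one partition of an (N, b2)-halo into b1-subhalos."""
    m = len(parts)
    b21 = b2 - b1
    log_p = (m - 1) * math.log(N * b21) - N * b21 - math.log(borel_pmf(N, b2))
    for j, n_j in Counter(parts).items():
        log_p += n_j * math.log(borel_pmf(j, b1)) - math.lgamma(n_j + 1)
    return math.exp(log_p)

def mean_subhalos(j, b1, N, b2):
    """(N / j) f_c(j, b1 | N, b2), the expected number of (j, b1)-subhalos."""
    r = b1 / b2
    if j == N:
        return r ** (N - 1)
    log_f = (math.log(N) + math.log1p(-r)
             + math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
             + j * math.log(j) - N * math.log(N)
             + (j - 1) * math.log(r) + (N - j - 1) * math.log(N - j * r))
    return (N / j) * math.exp(log_f)

N, b1, b2 = 8, 0.3, 0.7                   # illustrative values, b1 < b2
probs = {tuple(p): partition_probability(p, b1, N, b2) for p in partitions(N)}
total = sum(probs.values())
mean_n = {j: sum(p.count(j) * pr for p, pr in probs.items())
          for j in range(1, N + 1)}

print("sum over partitions:", total)
for j in range(1, N + 1):
    print(j, mean_n[j], mean_subhalos(j, b1, N, b2))
```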

The correlation between $(i,b_1)$- and $(j,b_1)$-halos that
are within the same $(N,b_2)$-halo is computed by a similar
average over all partitions of N. Thus,

$$\langle n_i n_j; b_1|N; b_2\rangle = \mathcal{N}(j,b_1|N,b_2)\;\mathcal{N}(i,b_1|N-j,b'), \qquad (13)$$

where

$$(N-j)\,b' = Nb_2 - jb_1$$

(Sheth 1996b). This expression reflects the fact that, in the
Lagrangian Poisson distribution, non-overlapping volumes
are mutually independent (Sheth & Lemson 1998). These
expressions will be useful later.

This subsection shows how the statistics of the initial 
Lagrangian distribution can be used to derive some useful 
information about the statistics of collapsed halos, and of 
the distribution of halos within halos. While the language of 
halos is useful, it is important to remember that an $(N,b_2)$-halo
can also be thought of as a Lagrangian region of size
$V_0(N) = Nb_2/\bar n$ (Mo & White 1996). Thus, expressions
like (13) are related to the average number of $(i,b_1)$- and
$(j,b_1)$-halos that are both within the same Lagrangian region
of size $V_0(N)$. It is in this sense that many of the expressions
above will be interpreted later in this paper.



2.2 The excursion set with shifted barrier 

The previous subsection considered the distribution of last
crossings, by random walk trajectories associated with the
Poisson distribution, of a barrier which had shape $\bar n V_0(j) = jb$.
Instead, suppose that

$$\bar n V_0(j) = \beta + jb, \quad {\rm with}\ 0 < b < 1, \qquad (14)$$

and we seek an expression for the fraction of trajectories
$f(j,b,\beta)$ for which $V_0(j)$ is the largest value of $V_0$ at which
$N(V_0) = (\bar N_0 - \beta)/b$. This is equivalent to considering the
same problem as before, but with the barrier shifted to the
right by a constant $\beta$ (see Fig. 1).

To compute $f(j,b,\beta)$, start with an arbitrarily small
sphere centred on a particle. Since the distribution is Poisson,
counts in different volumes are independent, so

$$f(j,b,\beta) = p(j-1|V_0(j))\; f_E(j,b,\beta), \qquad (15)$$

where $p(j-1|V_0)$ is the Poisson distribution of equation (3),
with $V_0(j)$ given by equation (14), and $f_E(j,b,\beta)$ denotes
the probability that no sphere larger than and concentric to
$V_0(j)$ is denser than the threshold value. Now, $f_E(j,b,\beta)$ is
the same as one minus the probability that $V_0(N)$ is the
largest volume denser than the threshold value, summed
over all $N > j$. As a consequence of the Poisson assumption,
$f_E(N,b,\beta)$ is independent of N, so it can be written
as $f_E(b,\beta)$ (e.g. Epstein 1983). This means that

$$1 - f_E(b,\beta) = f_E(b,\beta) \sum_{N=j+1}^{\infty} p\left(N-j\,|\,V_0(N) - V_0(j)\right). \qquad (16)$$

Since $\bar n[V_0(N) - V_0(j)] = (N-j)b$, setting $m = (N-j)$
means that the sum above is simply

$$\sum_{m=1}^{\infty} \frac{(mb)^m e^{-mb}}{m!} = \frac{b}{1-b}. \qquad (17)$$

The final expression follows from recognizing that the term
in the sum is $(mb)$ times the Borel(b) distribution. This implies
that

$$f_E(j,b,\beta) = f_E(b,\beta) = (1-b), \qquad (18)$$

so that

$$f(N,b,\beta) = (1-b)\,\frac{(\beta+Nb)^{N-1}}{(N-1)!}\; e^{-\beta-Nb}. \qquad (19)$$
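The last-crossing distribution of the shifted barrier can also be estimated by direct simulation. The sketch below is a minimal Monte Carlo under stated assumptions: each trajectory starts with the central particle, further particles arrive as a unit-rate Poisson process in the variable $\bar N_0 = \bar n V_0$, and a crossing at N means that the sphere with $\bar n V_0 = \beta + Nb$ contains exactly N particles. The walk is stopped once it has fallen a fixed number of steps (`gap`, an arbitrary cutoff) below the barrier, where a return is negligibly likely. Names, seed and parameter values are illustrative.

```python
import math
import random

def f_shifted(N, b, beta):
    """(1 - b) (beta + N b)^(N-1) exp(-beta - N b) / (N - 1)!"""
    return math.exp(math.log(1.0 - b) + (N - 1) * math.log(beta + N * b)
                    - beta - N * b - math.lgamma(N))

def last_crossing(beta, b, rng, gap=40):
    """Largest j at which the sphere with nbar*V0 = beta + j*b holds exactly j particles."""
    next_arrival = rng.expovariate(1.0)   # position of the next particle, in units of Nbar_0
    count = 1                             # the particle at the centre
    last = 0
    j = 1
    while j - count < gap:
        edge = beta + j * b
        while next_arrival <= edge:
            count += 1
            next_arrival += rng.expovariate(1.0)
        if count == j:
            last = j
        j += 1
    return last

rng = random.Random(12345)
beta, b, n_walks = 1.0, 0.5, 10000        # illustrative values
counts = {}
for _ in range(n_walks):
    j = last_crossing(beta, b, rng)
    counts[j] = counts.get(j, 0) + 1

for N in range(1, 6):
    print(N, counts.get(N, 0) / n_walks, f_shifted(N, b, beta))
```

The simulated frequencies agree with the analytic expression only to within sampling noise, so a larger `n_walks` is needed for a sharper comparison.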



This is the barrier crossing distribution associated with the
shifted barrier. Following equation (7), this barrier crossing
distribution can be associated with a number density of
Lagrangian regions which contain N particles:

$$n(N,b,\beta) = \bar n\,\frac{f(N,b,\beta)}{N}. \qquad (20)$$



Let $F(N,b,\beta)$ denote the barrier crossing distribution
if the trajectory is centred on a random position, not necessarily
on a particle. Then

$$F(N,b,\beta) = (1-b)\,p(N|V_0(N)) = \frac{\bar n V_0(N)}{N}\; f(N,b,\beta). \qquad (21)$$

The number density of associated regions, $n(N,b,\beta)$, is related
to this fraction analogously to how it is related to
$f(N,b,\beta)$. Namely, the barrier crossing distribution should
be weighted by the number of trajectories associated with
it, so $n(N,b,\beta)$ is $F(N,b,\beta)$ times the ratio of the average
density $\bar n$ to $\bar n V_0(N)$, so it is given by equation (20).

2.3 Statistics in the Eulerian space 

Imagine partitioning the Eulerian space $V_{\rm tot}$ containing $N_{\rm tot}$
particles into a large number of cells, each of size V. Then
the total number of such cells is $V_{\rm tot}/V$. We will be interested
in the limit in which $N_{\rm tot}/V_{\rm tot} \to \bar n$ as both $N_{\rm tot} \to \infty$ and
$V_{\rm tot} \to \infty$. Let $p(N|V)$ denote the fraction of these cells that
contain exactly N particles. If $n(N|V)$ denotes the number
of cells containing exactly N particles, then

$$p(N|V) = \frac{n(N|V)}{V_{\rm tot}/V} = \frac{V\,n(N|V)}{V_{\rm tot}} \qquad (22)$$

is said to be the Eulerian counts-in-cells distribution.

Suppose that $\bar n$ and $V_{\rm tot}$ in the Lagrangian and Eulerian
spaces are the same. Then $\bar n_0 = \bar n$. Further, suppose that the
number of regions which contain a specified number of particles
is the same in both the Lagrangian and the Eulerian
spaces. Finally, suppose that the parameter $\beta$, which controlled
the shape of the barrier of the previous subsection,
can be related to the Eulerian cell size V. Then equation (20)
requires that

$$V\,n(N|V) = \frac{\bar n V f(N|V)}{N}, \qquad (23)$$

where $n(N|V)$ is now understood as the number of such cells per unit
volume, so that

$$p(N|V) = V\,n(N|V) = \frac{f(N|V)}{N/\bar N}, \quad {\rm where}\ \bar N = \bar n V. \qquad (24)$$



Equation (24) provides a relation between the barrier crossing
distribution $f(N|V)$, which itself depends on the shape
of the boundary and the initial Lagrangian field, and the Eulerian
counts-in-cells distribution. By hypothesis, the shape
of the boundary depends on the Eulerian scale V, so a given
relation between $\beta$ and V implies a specific form for the
evolution of the comoving sizes of regions. This is discussed
in more detail in Section 2.5. Physically, this relation is
consistent with the assumption that the difference between
the particle distribution in the initial (Lagrangian) and final
(Eulerian) spaces arises solely as a consequence of the fact
that, although the comoving sizes of regions may change,
the number of expanding and contracting regions in the two
spaces is conserved.

Let $\Delta = (1+\delta) = N/\bar N$. Then, $p(\Delta|V)$ is the probability
distribution function of the density in Eulerian space, and

$$\int_0^\infty p(\Delta|V)\,{\rm d}\Delta = \int_0^\infty \Delta\,p(\Delta|V)\,{\rm d}\Delta = 1. \qquad (25)$$

Following equation (24), equation (19) has an associated
Eulerian counts in cells distribution:

$$p(N|V,b,\beta) = \frac{f(N,b,\beta)}{N/\bar N} = \frac{\bar N(1-b)}{N!}\,(\beta+Nb)^{N-1}\, e^{-\beta-Nb}, \qquad (26)$$

where $\bar N = \bar n V$. This is the Generalized Poisson distribution
(equation 1). Normalization to unity requires that

$$\beta = \bar N(1-b), \quad {\rm where}\ \bar N = \bar n V, \qquad (27)$$

so the variance is $\bar N/(1-b)^2$, and this distribution is the
same as that in equation (1).



Equation (27) shows how the parameter $\beta$ is related to
the Eulerian cell size V. In particular, notice that as $V \to 0$,
then the barrier is shifted by $\beta \to 0$, so, in this limit,
the barrier shape is the same as that in Section 2.1. This
shows explicitly that, as $V \to 0$, the Eulerian counts in cells
distribution is the same as the halo mass function.



2.4 The halo distribution 

Recall that the Generalized Poisson distribution with parameter
b can be interpreted as arising from a Poisson distribution
of Borel(b) halos. This subsection shows that the
derivation of the Generalized Poisson distribution presented
earlier is consistent with this interpretation.

Fig. 1 shows that the fraction of trajectories which last
cross the constant barrier, parameterized by $b_1$, at j is equal
to the fraction of those trajectories which last crossed the
shifted barrier (associated with the Eulerian scale V and
parameter $b \ge b_1$) at $N \ge j$, that had their last crossing of
the constant barrier at j, summed over all $N \ge j$. A little
algebra shows that

$$f_c(j,b_1) = \sum_{N=j}^{\infty} f_c(j,b_1|N,b_2)\; f(N,b,\beta), \quad {\rm with}\ b \ge b_1, \qquad (28)$$

where $f_c(j,b_1)$ and $f_c(j|N)$ are given by equations (5)
and (6), and $b_2 = \bar N_0/N = (\beta+Nb)/N$, as required by
equation (14). This relation implies that



$$n(j,b_1)V = \sum_{N \ge j} \mathcal{N}(j,b_1|N,b_2)\; p(N|V,b,\beta) = n_0(j,b_1)V, \qquad (29)$$

where the final expression is V times equation (7), and follows
from setting $b_2 = (\beta+Nb)/N$.
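The "little algebra" can be cross-checked numerically. The sketch below sums the conditional constant-barrier distribution against the shifted-barrier distribution, with $b_2 = (\beta+Nb)/N$, and compares the result with the unconditional constant-barrier distribution $(1-b_1)(jb_1)^{j-1}e^{-jb_1}/(j-1)!$. The explicit forms coded here are the ones assumed in this example, and the function names, parameter values and truncation of the sum are illustrative.

```python
import math

def f_c(j, b1):
    """Constant barrier: (1 - b1) (j b1)^(j-1) exp(-j b1) / (j - 1)!"""
    return math.exp(math.log(1.0 - b1) + (j - 1) * math.log(j * b1)
                    - j * b1 - math.lgamma(j))

def f_c_conditional(j, b1, N, b2):
    """Conditional constant-barrier distribution f_c(j, b1 | N, b2)."""
    r = b1 / b2
    if j == N:
        return r ** (N - 1)
    if r >= 1.0:
        return 0.0
    log_f = (math.log(N) + math.log1p(-r)
             + math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
             + j * math.log(j) - N * math.log(N)
             + (j - 1) * math.log(r) + (N - j - 1) * math.log(N - j * r))
    return math.exp(log_f)

def f_shifted(N, b, beta):
    """Shifted barrier: (1 - b) (beta + N b)^(N-1) exp(-beta - N b) / (N - 1)!"""
    return math.exp(math.log(1.0 - b) + (N - 1) * math.log(beta + N * b)
                    - beta - N * b - math.lgamma(N))

def summed_over_N(j, b1, b, beta, n_max=600):
    """Sum over N >= j of f_c(j, b1 | N, b2) f(N, b, beta), with N b2 = beta + N b."""
    return sum(f_c_conditional(j, b1, N, (beta + N * b) / N) * f_shifted(N, b, beta)
               for N in range(j, n_max))

b1, b, beta = 0.2, 0.5, 1.5               # illustrative values, b >= b1
results = [(j, summed_over_N(j, b1, b, beta), f_c(j, b1)) for j in range(1, 6)]
for row in results:
    print(row)
```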

There are two reasons for writing this calculation out in 
detail. The first is simply to show explicitly that the mean 
number density of $(j,b_1)$-halos in the Eulerian space is the
same as in the Lagrangian space, as required. The second is
that it shows how statistics in the Lagrangian space can be
used to compute statistics in the Eulerian space. Recall that
the number of regions containing N particles is the same in
both spaces, though the sizes $V_0$ and V of the regions may
be different. In particular, the Lagrangian scale associated
with an Eulerian region of size V depends on the number of
particles N within it: $V_0(N) = (\beta+Nb)/\bar n$ (equation 14).
So, to compute averages over Eulerian cells V, one simply
needs to sum over the relevant Lagrangian regions $V_0(N)$
that now have Eulerian scale V, and weight by the Eulerian
probability $p(N|V)$ that the Eulerian region V contains N
particles.

Thus, the cross-correlation between halos and mass, averaged
over Eulerian cells V, can be computed as follows.
Define

$$\delta_h(j,b_1|N,b_2) = \frac{\mathcal{N}(j,b_1|N,b_2)}{n_0(j,b_1)V} - 1. \qquad (30)$$

This is the number of $(j,b_1)$-halos that are within Lagrangian
regions of size $V_0(N) = Nb_2/\bar n$ and which contain
exactly N particles, divided by the average number of
$(j,b_1)$-halos that are within Eulerian volumes of size V, minus
one. By hypothesis, the number of such Lagrangian regions
is constant, only their size has changed. However, if
we now require that $b_2 = (\beta+Nb)/N$, then this means that
the Eulerian size of such a Lagrangian region is V. So, if we
require that $b_2 = (\beta+Nb)/N$, then equation (30) is the number
of $(j,b_1)$-halos in Eulerian cells V that contain exactly
N particles, relative to the average number of $(j,b_1)$-halos
in Eulerian cells V, minus one.

The cross correlation function between $(j,b_1)$-halos and
mass, averaged over all Eulerian cells V, is $\delta_h(j,b_1|N,b_2)$,
with $Nb_2 = \beta + Nb$, times $\delta = (N-\bar N)/\bar N$ times the probability
that an Eulerian region V contains exactly N particles,
summed over all N. Thus,

$$\xi_{hm}(j,b_1,\beta) = \langle \delta_h(j,b_1|N,b_2)\,\delta \rangle$$
$$= \sum_{N=j}^{\infty} \frac{N}{\bar N}\,\frac{\mathcal{N}(j,b_1|N,b_2)}{n(j,b_1)V}\; p(N|V,b,\beta) - 1$$
$$= \sum_{N=j}^{\infty} \frac{N}{\bar N}\,\frac{f_c(j,b_1|N,b_2)}{f_c(j,b_1)}\; f(N,b,\beta) - 1$$
$$= \frac{j}{\bar N}\,\frac{(1-b_1)}{(1-b)} + \frac{(b-b_1)}{\bar N(1-b)^2(1-b_1)}, \qquad (31)$$

where $\mathcal{N}(j|N)$ is given by equation (8), $n(j,b_1)$ by equation (29),
and $p(N|V,b,\beta)$ by equation (26). The second line
follows from equation (30), and the fact that $\langle\Delta\rangle = \langle 1+\delta\rangle = 1$
(equation 25), so $\langle\delta\rangle = 0$. The third line follows from equations
(8), (7) and (24), and the last line from doing the sum,
after using the fact that $Nb_2 = (\beta+Nb)$. Notice that when
$b = b_1$ then $\xi_{hm} = j/\bar N$.
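The closed form for the halo-mass cross correlation can be compared with a direct evaluation of the sum over N. The sketch below assumes the sum in the form $\sum_{N\ge j}(N/\bar N)\,[f_c(j,b_1|N,b_2)/f_c(j,b_1)]\,f(N,b,\beta) - 1$ with $\beta = \bar N(1-b)$ and $Nb_2 = \beta + Nb$, and the closed form $j(1-b_1)/[\bar N(1-b)] + (b-b_1)/[\bar N(1-b)^2(1-b_1)]$. Function names, parameter values and the truncation of the sum are illustrative assumptions.

```python
import math

def f_c(j, b1):
    """Constant barrier last-crossing distribution."""
    return math.exp(math.log(1.0 - b1) + (j - 1) * math.log(j * b1)
                    - j * b1 - math.lgamma(j))

def f_c_conditional(j, b1, N, b2):
    """Conditional constant-barrier distribution; assumes b1 < b2 when j < N."""
    r = b1 / b2
    if j == N:
        return r ** (N - 1)
    log_f = (math.log(N) + math.log1p(-r)
             + math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
             + j * math.log(j) - N * math.log(N)
             + (j - 1) * math.log(r) + (N - j - 1) * math.log(N - j * r))
    return math.exp(log_f)

def f_shifted(N, b, beta):
    """Shifted barrier last-crossing distribution."""
    return math.exp(math.log(1.0 - b) + (N - 1) * math.log(beta + N * b)
                    - beta - N * b - math.lgamma(N))

def xi_hm_sum(j, b1, b, Nbar, n_max=800):
    """Direct evaluation of the sum over N."""
    beta = Nbar * (1.0 - b)
    total = sum((N / Nbar) * f_c_conditional(j, b1, N, (beta + N * b) / N)
                * f_shifted(N, b, beta) for N in range(j, n_max))
    return total / f_c(j, b1) - 1.0

def xi_hm_closed(j, b1, b, Nbar):
    """Closed form quoted in the lead-in."""
    return (j * (1.0 - b1) / (Nbar * (1.0 - b))
            + (b - b1) / (Nbar * (1.0 - b) ** 2 * (1.0 - b1)))

b1, b, Nbar = 0.2, 0.5, 4.0               # illustrative values
rows = [(j, xi_hm_sum(j, b1, b, Nbar), xi_hm_closed(j, b1, b, Nbar))
        for j in range(1, 5)]
for row in rows:
    print(row)
```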

The correlation between $b_1$-halos of mass i and j, averaged
over Eulerian cells V, arises as a result of two averages.
The first is over all possible ways the N particles in an Eulerian
cell V could have been partitioned into $b_1$-halos, and
the second is over all possible values of N. The assumption
that an Eulerian cell V is simply a Lagrangian region that
has changed size allows us to assume that the first average
(over all partitions of N) is the same in the two spaces. So,
the result of this average is $\langle n_i n_j; b_1|N; b_2\rangle$ of equation (13),
provided we set $b_2 = (\beta+Nb)/N$. All that remains is to average
this quantity over N, and then divide out the factors
expected on average. Thus,

$$\xi_{hh}(i,j,b_1|V) = \sum_{N} \frac{\langle n_i n_j; b_1|N; b_2\rangle}{n(i,b_1)V\; n(j,b_1)V}\; p(N|V,b,\beta) - 1, \qquad (32)$$

where

$$Nb_2 = \beta + Nb, \quad {\rm and} \quad (N-j)\,b' = Nb_2 - jb_1.$$

This sum can be solved analytically:

$$\xi_{hh}(i,j,b_1|V) = \frac{(i+j)(b-b_1)}{\bar N(1-b)} + \frac{(b-b_1)^2}{\bar N(1-b)^2(1-b_1)^2}. \qquad (33)$$
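The analytic halo-halo result can be compared with a brute-force evaluation of the sum over N. The sketch below assumes the pair average is the product of the mean number of $(j,b_1)$-halos in an $(N,b_2)$-halo and the mean number of $(i,b_1)$-halos in the remaining $(N-j,b')$ region, weights it by the Generalized Poisson counts distribution, and normalizes by $n(i,b_1)V\,n(j,b_1)V$ with $n(j,b_1)V = \bar N(1-b_1)\,\eta(j,b_1)$. The closed form tested is $(i+j)(b-b_1)/[\bar N(1-b)] + (b-b_1)^2/[\bar N(1-b)^2(1-b_1)^2]$. Names, parameter values and the truncation are illustrative assumptions.

```python
import math

def borel_pmf(n, b):
    """Borel(b) halo mass function."""
    return math.exp((n - 1) * math.log(n * b) - n * b - math.lgamma(n + 1))

def gp_pmf(N, Nbar, b):
    """Generalized Poisson counts-in-cells distribution."""
    theta = Nbar * (1.0 - b)
    return math.exp(math.log(theta) + (N - 1) * math.log(theta + N * b)
                    - theta - N * b - math.lgamma(N + 1))

def n_cond(j, b1, N, b2):
    """Mean number of (j, b1)-halos inside an (N, b2)-halo; assumes b1 < b2 if j < N."""
    r = b1 / b2
    if j == N:
        return r ** (N - 1)
    log_f = (math.log(N) + math.log1p(-r)
             + math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
             + j * math.log(j) - N * math.log(N)
             + (j - 1) * math.log(r) + (N - j - 1) * math.log(N - j * r))
    return (N / j) * math.exp(log_f)

def xi_hh_sum(i, j, b1, b, Nbar, n_max=600):
    """Direct evaluation of the sum over N >= i + j."""
    beta = Nbar * (1.0 - b)
    total = 0.0
    for N in range(i + j, n_max):
        b2 = (beta + N * b) / N
        b_prime = (N * b2 - j * b1) / (N - j)
        total += (n_cond(j, b1, N, b2) * n_cond(i, b1, N - j, b_prime)
                  * gp_pmf(N, Nbar, b))
    norm = (Nbar * (1.0 - b1)) ** 2 * borel_pmf(i, b1) * borel_pmf(j, b1)
    return total / norm - 1.0

def xi_hh_closed(i, j, b1, b, Nbar):
    """Closed form quoted in the lead-in."""
    return ((i + j) * (b - b1) / (Nbar * (1.0 - b))
            + (b - b1) ** 2 / (Nbar * (1.0 - b) ** 2 * (1.0 - b1) ** 2))

print(xi_hh_sum(2, 3, 0.2, 0.5, 4.0), xi_hh_closed(2, 3, 0.2, 0.5, 4.0))
print(xi_hh_sum(1, 4, 0.3, 0.3, 4.0))     # b = b1: halos should be uncorrelated
```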



When $b = b_1$, $\xi_{hh} = 0$, for all i, j and V, which implies
that the halos have a Poisson distribution. This is consistent
with the fact that equation (1) is a Compound Poisson distribution
which arises if Borel(b) clusters have a Poisson spatial
distribution (Saslaw 1989; Sheth & Saslaw 1994). When
$b_1 = 0$, then all halos are single particles, so $i = j = 1$,
and this expression gives the second factorial moment of
the single particle distribution. Simple algebra shows that,
in this limit, it is equal to the second factorial moment of
equation (26). Further, notice that when $b_1 < b$, then correlations
between halos only depend on the sum of the halo
masses, not on the masses of the individual halos themselves.
This suggests that, in this model, halo-halo correlations arise
because of volume exclusion effects only. That is, halo-halo
correlations arise only because, initially, a $(j,b_1)$-halo occupies
a region $V_0(j) = jb_1/\bar n$, so no other halos can occupy
this region. As time passes, such an object collapses to a
region of zero Eulerian size, so the volume excluded by it
becomes negligibly small.

Figure 2. The physical radius of a perturbation in units of the
spherical model turnaround radius, as a function of time in units
of the spherical model turnaround time. The solid curves show
the spherical model, and dashed curves show the model developed
here. The two curves for each line type are for denser perturbations
(which recollapse) and under-dense perturbations (which do
not).



2.5 Relation to the spherical collapse model 

In the spherical Poisson model, b is related to the critical
overdensity required for collapse: $b = 1/(1+\delta_{c0})$ (equation 4).
This relation for b, with equation (27) for $\beta$ and
equation (14), imply that, as N changes, the height relative
to the average density of the shifted barrier considered here
is

$$1 + \delta_0 = \frac{N}{\bar n V_0(N)} = \frac{N}{\bar N(1-b) + Nb} = \frac{(1+\delta_{c0})\,(1+\delta)}{\delta_{c0} + (1+\delta)}, \qquad (34)$$

where $(1+\delta) = N/\bar N$. When $b \to 1$, and $N \gg 1$, then $\delta_{c0} \ll 1$,
so

$$\delta_0 \to \delta_{c0}\,\frac{\delta}{1+\delta}. \qquad (35)$$

In this limit, the relation between $\delta_0$ and $\delta$ is independent
of V. This relation should be compared with that for the
spherical collapse model given in Appendix A.

In the limits $\delta_{c0} \ll 1$ and $N \gg 1$, equation (35) suggests
the following model for the collapse of objects. Let $R(z)$
denote the comoving size of an object at the epoch z. Then
$R(z) = R_0$ initially. If the object is in an underdense region,
then its comoving size will increase, else it will decrease.
Trajectories with extrapolated linear overdensity $\delta_0$ greater
than $\delta_{c0}$ are associated with collapsed objects. Collapsed
objects have $R(z) = 0$. Until collapse



© 0000 RAS, MNRAS 000, 000-000 






$$\left[\frac{R(z)}{R_0}\right]^3 = 1 - \frac{\delta_0/(1+z)}{\delta_{c0}}. \qquad (36)$$



The radius of an object in proper, physical coordinates
is $R_p(z) = R(z)/(1+z)$. Objects which collapse have a
turnaround radius, the maximum value that $R_p(z)$ attains.
This occurs at

$$(1+z_{\rm ta}) = \frac{4}{3}\,\frac{\delta_0}{\delta_{c0}}, \qquad (37)$$

at which time

$$\frac{R(z_{\rm ta})}{R_0} = \frac{1}{4^{1/3}}. \qquad (38)$$



Figure 2 shows that, for overdense perturbations,
turnaround in this model (dashed lines) occurs later, and 
at a larger radius, than in the spherical model (solid lines). 
In contrast, underdense regions expand less rapidly in this 
model than in the spherical model. 
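The turnaround epoch and radius can be recovered numerically from the collapse law. The sketch below assumes the comoving radius obeys $[R(z)/R_0]^3 = 1 - \delta_0/[(1+z)\,\delta_{c0}]$ until collapse, scans $(1+z)$ on a grid for the maximum of the proper radius $R(z)/(1+z)$, and compares with $(1+z_{\rm ta}) = (4/3)\,\delta_0/\delta_{c0}$ and $R(z_{\rm ta})/R_0 = 4^{-1/3}$. The function names, the grid and the values of $\delta_0$ and $\delta_{c0}$ are illustrative assumptions.

```python
def comoving_radius(one_plus_z, delta0, delta_c0, R0=1.0):
    """R(z) from [R/R0]^3 = 1 - (delta0 / (1 + z)) / delta_c0; zero after collapse."""
    x = 1.0 - (delta0 / one_plus_z) / delta_c0
    return R0 * max(x, 0.0) ** (1.0 / 3.0)

def proper_radius(one_plus_z, delta0, delta_c0, R0=1.0):
    """Proper (physical) radius R_p(z) = R(z) / (1 + z)."""
    return comoving_radius(one_plus_z, delta0, delta_c0, R0) / one_plus_z

delta_c0, delta0 = 0.1, 0.3               # illustrative values; collapse at 1 + z = 3
grid = [3.0 + 0.001 * k for k in range(1, 20001)]
best = max(grid, key=lambda x: proper_radius(x, delta0, delta_c0))

print("numerical (1 + z_ta):", best)
print("predicted (1 + z_ta):", (4.0 / 3.0) * delta0 / delta_c0)
print("R(z_ta)/R0:", comoving_radius(best, delta0, delta_c0), "vs", 4.0 ** (-1.0 / 3.0))
```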



3 THE ASSOCIATED QUEUE 

The excursion set problem studied above can be understood 
in terms of a simple queue system. A single server queue 
with deterministic service time $0 < b < 1$ and Poisson arrivals
with unit mean, which starts with one customer at
the initial time, can be expressed in terms of the random 
walk problem studied by Epstein (1983) and Sheth (1995b). 
The time parameter in the queue system is like the cell size 
parameter in the excursion set model. 

Consider the probability B(j,b) that, after exactly j 
customers have been served, the queue is empty for the first 
time. Then f(j, b) of equation (||) is j times this probability 
times $(1 - b)$. The $(1 - b)$ factor simply comes from the
additional constraint in the excursion problem on the number
of particles within volumes larger than the critical volume
$V_0(j)$. This constraint is not present in the queue model,
since we have made no assumption about what happens in
the queue after the end of the first busy period.

Figure ^ shows clearly that the same queue system, 
started with m customers at the initial time, is related to the 
excursion set problem studied in the previous section. Let 
$B(N, b|m)$ denote the probability that exactly N customers
were served in the first busy period, given that there were
exactly m customers in the queue initially. Tanner (1953,
1961) shows that

$$B(N, b|m) = \frac{m}{N}\,\frac{(Nb)^{N-m}\,e^{-Nb}}{(N-m)!}, \tag{39}$$

and he also discusses the origin of the (m/N) term. Notice
that $B(N, b|1)$ is the Borel(b) distribution.
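
As a quick numerical illustration, the sketch below evaluates the busy-period distribution using the standard Borel-Tanner form $B(N,b|m) = (m/N)\,(Nb)^{N-m} e^{-Nb}/(N-m)!$. The function names and the parameter values are hypothetical choices made for this example. It checks that the distribution sums to unity, that its mean is $m/(1-b)$, and that the case m = 1 reduces to the Borel(b) distribution.

```python
import math

def borel_tanner(N, b, m=1):
    """P(exactly N customers served in the first busy period | m at start).

    Assumes 0 < b < 1; evaluated in log space to avoid overflow.
    """
    if N < m:
        return 0.0
    k = N - m
    log_p = (math.log(m) - math.log(N) + k * math.log(N * b)
             - N * b - math.lgamma(k + 1))
    return math.exp(log_p)

def borel(N, b):
    """Borel(b) distribution: (Nb)^(N-1) exp(-Nb) / N!."""
    return math.exp((N - 1) * math.log(N * b) - N * b - math.lgamma(N + 1))

# Illustrative parameter values (not from the paper).
b, m = 0.5, 3
total = sum(borel_tanner(N, b, m) for N in range(m, 600))
mean = sum(N * borel_tanner(N, b, m) for N in range(m, 600))
print(total, mean, m / (1.0 - b))
```

The truncation at N = 600 is harmless here because the terms decay exponentially for b < 1.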

The excursion set probability $f(N, b, \beta)$ of equation (19)
is the same as the probability that there were exactly m
customers at the start, times $(N/m)\,B(N, b|m)$, times $(1 - b)$,
summed over all possible values of m. In the excursion set
problem, the probability that there are exactly m customers
at the start is simply

$$p(m, \beta) = \frac{\beta^{m-1}\,e^{-\beta}}{(m-1)!} \tag{40}$$

(Figure [j]), so that

$$f(N, b, \beta) = \sum_{m=1}^{N} (1-b)\,(N/m)\,B(N, b|m)\,p(m, \beta). \tag{41}$$

Appendix D of Sheth (1996b) shows that the right hand side
of this expression is the same as that on the right hand side
of equation (19). The corresponding expression for trajectories
not necessarily centred on a particle, $F(N, b, \beta)$, can also
be derived in this context. Simply set $p(m, \beta)$ to $(\beta/m)$
times the expression above, and do the sum. Sheth (1996b)
discusses a branching process derivation of this formula. Thus,
this section shows how that branching process, this queue
model, and the excursion set model of the previous section
are all interrelated.
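
The sum over initial queue lengths can be checked numerically. The sketch below rests on stated assumptions rather than on anything confirmed by this section. It takes the starting distribution to be $p(m,\beta) = \beta^{m-1}e^{-\beta}/(m-1)!$. It takes the busy-period distribution to be the standard Borel-Tanner form. And it takes equation (19) to be $f(N,b,\beta) = (1-b)(\beta+Nb)^{N-1}e^{-\beta-Nb}/(N-1)!$. All function names and parameter values are hypothetical.

```python
import math

def borel_tanner(N, b, m):
    """Borel-Tanner busy-period distribution, assuming 0 < b < 1."""
    if N < m:
        return 0.0
    k = N - m
    return math.exp(math.log(m) - math.log(N) + k * math.log(N * b)
                    - N * b - math.lgamma(k + 1))

def p_start(m, beta):
    """Assumed probability of m customers at the start (shifted Poisson)."""
    return math.exp((m - 1) * math.log(beta) - beta - math.lgamma(m))

def f_queue_sum(N, b, beta):
    """Sum over m of (1 - b) (N/m) B(N, b|m) p(m, beta)."""
    return sum((1.0 - b) * (N / m) * borel_tanner(N, b, m) * p_start(m, beta)
               for m in range(1, N + 1))

def f_closed(N, b, beta):
    """Assumed closed form for the excursion set probability f(N, b, beta)."""
    mean = beta + N * b
    return (1.0 - b) * math.exp((N - 1) * math.log(mean) - mean
                                - math.lgamma(N))

# Illustrative parameter values (not from the paper).
b, beta = 0.3, 1.7
for N in range(1, 6):
    print(N, f_queue_sum(N, b, beta), f_closed(N, b, beta))
norm = sum(f_closed(N, b, beta) for N in range(1, 400))
```

Under these assumptions the two columns agree term by term (a consequence of the ordinary binomial theorem), and the closed form sums to unity over N.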



4 A SCALING SOLUTION 

This subsection extends the results of Sheth (1995b), Sec- 
tion 3, for a constant barrier to the shifted barrier considered 
in this paper. In particular, it shows that the shifted barrier 
problem has a scaling solution that is analogous to the one 
associated with the constant barrier. 

Suppose that the underlying distribution is not Poisson,
but is Compound Poisson. This means that equation (^|)
must be replaced with $P_{CP}[n|V_0(n)]$.

Then 



f CP (n,S) = fhp(n,S) /cp(n,<5), 



(42) 



where the first term is the probability that there are exactly
n particles within $V_0(n)$, and the second is the probability
that all volumes larger than $V_0(n)$ are less dense than it.
The second term is obtained by noting that the argument
leading to equation (18) holds here also. Therefore, when
$P_{CP}(n|V_0, b_0)$ is the Generalized Poisson distribution with
parameter $b_0$, and $V_0(n)$ is given by equation (14), i.e.,

$$\bar n V_0(n) = \beta + nb = \bar N(1-b) + nb = \bar N_n,$$

then $f^E_{GPD}$ is given by an expression like (16) but with P
replaced by $P_{CP}$. Since $\bar n[V_0(n) - V_0(j)] = (n-j)b$, setting
$m = n - j$ and

$$B = b_0 + b(1-b_0) \tag{43}$$



means that 



$$\frac{1}{f^E_{GPD}} = 1 + \sum_{m=1}^{\infty} m\,b(1-b_0)\,\frac{(mB)^{m-1}\,e^{-mB}}{m!}. \tag{44}$$

This sum is similar to that in equation (17). Thus,

$$\frac{1}{f^E_{GPD}} = 1 + \frac{b(1-b_0)}{1-B}. \tag{45}$$



The other term is slightly more complicated. Recall
that the Generalized Poisson distribution with parameter
$b_0$ can be understood as describing a Poisson distribution
of point-sized clusters, where $\eta(m, b_0)$, the probability that
a randomly chosen cluster contains exactly m particles, is
the Borel($b_0$) distribution. Since the mean of the Borel($b_0$)
distribution is $(1 - b_0)^{-1}$, the ratio of the density of cluster
centres to that of particles is $n_{clus}/\bar n = (1 - b_0)$, and



$$f_{GPD}(n) = \sum_{m=1}^{n} \frac{n_{clus}}{\bar n}\; m\,\eta(m, b_0)\; P_{GPD}\big(n-m\,|\,V_0(n), b_0\big) \tag{46}$$



(Sheth 1995b equation 26). Abel's generalization of the Bi- 
nomial formula (e.g. Sheth 1995b equation 30) reduces this 
to 



$$f_{GPD}(n) = (1-B)\,\frac{(\theta + nB)^{n-1}\,e^{-\theta - nB}}{(n-1)!}, \tag{47}$$

where

$$1 - B = (1-b)(1-b_0) \qquad\text{and}\qquad \theta = \bar N(1-B).$$



This has the same form as equation (19). Notice that when
$b_0 = 0$, then $B = b$, and this expression is exactly the same
as equation (19). This is sensible, since a Generalized Poisson
distribution with parameter $b_0 = 0$ is just a Poisson
distribution.

Since b is a pseudo-time variable, the new definition
of B simply means that the time parameter in this model 
is slightly different from that in the case of Poisson initial 
conditions. In other words, if the initial Lagrangian distri- 
bution is Generalized Poisson, rather than Poisson, then, 
except for the appropriate rescalings of the time-dependent 
parameters, none of the results of Section 2 are changed. 
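
The rescaling can be illustrated numerically. The sketch below is a consistency check under explicit assumptions, not a statement of the paper's method. It assumes the Generalized Poisson probability for a cell of mean count M is $\theta(\theta+kb_0)^{k-1}e^{-\theta-kb_0}/k!$ with $\theta = M(1-b_0)$. It assumes the mean count within $V_0(n)$ is $\bar N(1-b)+nb$, and that the exterior factor equals $(1-b)$, as equation (45) implies once $1-B=(1-b)(1-b_0)$ is used. With these, the product of the exterior factor and the cluster sum should reproduce a closed form with $B = b_0 + b(1-b_0)$ and $\theta = \bar N(1-B)$. All function names and parameter values are hypothetical.

```python
import math

def borel(m, b0):
    """Borel(b0) distribution: (m b0)^(m-1) exp(-m b0) / m!  (needs b0 > 0)."""
    return math.exp((m - 1) * math.log(m * b0) - m * b0 - math.lgamma(m + 1))

def gpd(k, mean_count, b0):
    """Generalized Poisson pmf for a cell whose mean count is mean_count."""
    theta = mean_count * (1.0 - b0)
    return math.exp(math.log(theta) + (k - 1) * math.log(theta + k * b0)
                    - theta - k * b0 - math.lgamma(k + 1))

def cluster_sum(n, nbar_cell, b, b0):
    """Sum over cluster sizes m of (1 - b0) m eta(m, b0) P_GPD(n - m | V0(n))."""
    mean_count = nbar_cell * (1.0 - b) + n * b   # assumed mean count in V0(n)
    return sum((1.0 - b0) * m * borel(m, b0) * gpd(n - m, mean_count, b0)
               for m in range(1, n + 1))

def rescaled_closed_form(n, nbar_cell, b, b0):
    """(1 - B) (theta + nB)^(n-1) exp(-theta - nB) / (n-1)!."""
    B = b0 + b * (1.0 - b0)
    theta = nbar_cell * (1.0 - B)
    return (1.0 - B) * math.exp((n - 1) * math.log(theta + n * B)
                                - theta - n * B - math.lgamma(n))

# Illustrative parameter values (not from the paper).
nbar_cell, b, b0 = 5.0, 0.4, 0.25
for n in range(1, 6):
    lhs = (1.0 - b) * cluster_sum(n, nbar_cell, b, b0)
    print(n, lhs, rescaled_closed_form(n, nbar_cell, b, b0))
```

The agreement of the two columns is the numerical counterpart of applying Abel's generalization of the Binomial formula to the cluster sum.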



5 THE TWO BARRIER PROBLEM 

Suppose $b_1 < b_2$. Let $f(j, b_1|k, b_2)$ denote the fraction of
trajectories, centred on a particle, which have j particles
when they last crossed the barrier with index $b_1$, when it
is known that they have exactly k particles when they last
cross the barrier with index $b_2$. When $k > \bar N$, then the
results of Sheth (1995b) imply that



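$$f(j, b_1|k, b_2) = \binom{k-1}{j-1}\,\frac{{}_2V_k - {}_1V_k}{({}_2V_k)^{k-1}}\,({}_1V_j)^{j-1}\,({}_2V_k - {}_1V_j)^{k-j-1}, \tag{48}$$

where j < k,

$$\bar n\,{}_1V_j = \bar N(1-b_1) + jb_1, \qquad \bar n\,{}_1V_k = \bar N(1-b_1) + kb_1,$$
$$\text{and}\qquad \bar n\,{}_2V_k = \bar N(1-b_2) + kb_2.$$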

(Notice that equation [] is this expression with $\bar n\,{}_1V_j = jb_1$,
$\bar n\,{}_1V_k = kb_1$, and $\bar n\,{}_2V_k = kb_2$.) This reflects the fact that a
comoving volume which is denser than the average density
at time $b_2$ will have been less dense at an earlier time. When
$k < \bar N$, then



$$f(k, b_2|j, b_1) = \binom{j-1}{k-1}\,\frac{{}_1V_j - {}_2V_j}{({}_1V_j)^{j-1}}\,({}_2V_k)^{k-1}\,({}_1V_j - {}_2V_k)^{j-k-1}, \tag{49}$$



but now k < j, since a comoving volume that is less dense
than the average at some late time $b_2$ must also have been
underdense at the earlier time $b_1 < b_2$, and its density will
have decreased since the earlier time. This expression is the
probability that a cell contains exactly k particles at time $b_2$
given that at some earlier time $b_1 < b_2$ it contained exactly
j particles. These expressions are the analogues of equation (|).

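In the limits $k \gg \bar N$ and $k \gg j$, the use of Stirling's
approximation shows that

$$f(j|k) \to f(j, B, \beta'), \tag{50}$$

where $f(j, B, \beta')$ has the same form as equation (19) with

$$B = \frac{k\,b_1}{\bar n\,{}_2V_k} \qquad\text{and}\qquad \beta' = \frac{k\,\bar N(1-b_1)}{\bar n\,{}_2V_k}. \tag{51}$$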



This is similar to the rescaling associated with the constant
barrier: When $N \gg j$ and $b_2 > b_1$, then

$$f_0(j, b_1|N, b_2) \to f_0(j,\, b_1/b_2). \tag{52}$$



This rescaling, and the scaling solution of the previous sec- 
tion, suggest that there may be a merger-fragmentation 
model of the type described by Sheth & Pitman (1997) as- 
sociated with the Generalized Poisson distribution. 

For trajectories that are not necessarily centred on
particles, the expression corresponding to equation (48) is

$$F(j, b_1|k, b_2) = \binom{k}{j}\,\frac{{}_2V_k - {}_1V_k}{({}_2V_k)^{k}}\,({}_1V_j)^{j}\,({}_2V_k - {}_1V_j)^{k-j-1}, \tag{53}$$

when $k > \bar N$, and F(k|j) is related to f(k|j) similarly. These
expressions follow from arguments given in Sheth & Lemson
(1998). Identities associated with Abel's generalization of
the Binomial theorem show that all these expressions are
normalized to unity.
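
The normalization can be checked directly. The sketch below is illustrative and assumes specific readings of the two-barrier expressions: $f(j,b_1|k,b_2) = \binom{k-1}{j-1}({}_2V_k-{}_1V_k)({}_1V_j)^{j-1}({}_2V_k-{}_1V_j)^{k-j-1}/({}_2V_k)^{k-1}$ for particle-centred trajectories, and $F(j,b_1|k,b_2) = \binom{k}{j}({}_2V_k-{}_1V_k)({}_1V_j)^{j}({}_2V_k-{}_1V_j)^{k-j-1}/({}_2V_k)^{k}$ otherwise, with the mean count in each volume taken to be $\bar N(1-b)+nb$. The sums below include the j = k term, which is an assumption of this sketch. Function names and parameter values are hypothetical.

```python
from math import comb, isclose

def mean_count(n, b, nbar_cell):
    """Assumed mean count in the volume at which n particles cross barrier b."""
    return nbar_cell * (1.0 - b) + n * b

def f_centred(j, k, b1, b2, nbar_cell):
    """Assumed two-barrier probability for particle-centred trajectories."""
    v1j = mean_count(j, b1, nbar_cell)
    v1k = mean_count(k, b1, nbar_cell)
    v2k = mean_count(k, b2, nbar_cell)
    return (comb(k - 1, j - 1) * (v2k - v1k) / v2k ** (k - 1)
            * v1j ** (j - 1) * (v2k - v1j) ** (k - j - 1))

def f_random(j, k, b1, b2, nbar_cell):
    """Assumed two-barrier probability for randomly placed trajectories."""
    v1j = mean_count(j, b1, nbar_cell)
    v1k = mean_count(k, b1, nbar_cell)
    v2k = mean_count(k, b2, nbar_cell)
    return (comb(k, j) * (v2k - v1k) / v2k ** k
            * v1j ** j * (v2k - v1j) ** (k - j - 1))

# Illustrative values with k larger than the mean count (not from the paper).
nbar_cell, b1, b2, k = 3.0, 0.2, 0.6, 8
sum_centred = sum(f_centred(j, k, b1, b2, nbar_cell) for j in range(1, k + 1))
sum_random = sum(f_random(j, k, b1, b2, nbar_cell) for j in range(0, k + 1))
print(sum_centred, sum_random)
```

Both sums should equal unity, which is Abel's identity in numerical form. Because each expression is homogeneous of degree zero in the volumes, working with mean counts instead of volumes does not change the result.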

This last expression is related to the following problem.
Choose a random Eulerian cell of comoving size V in an N-body
simulation, and study the evolution of the mass within
it. Let $p(j, b_1|k, b_2)$ denote the probability that at time $b_1$
there are exactly j particles within it, given that at some
time $b_2 > b_1$ it is known to contain exactly k > j particles.
Then $p(j, b_1|k, b_2) = F(j, b_1|k, b_2)$. These expressions show
explicitly that, for some Eulerian cells, it may happen that
the number of particles within the cell decreases initially
and increases later. In other words, in the model, matter
can flow in and out of Eulerian cells.



6 DISCUSSION 

This paper presents a new derivation of the Generalized 
Poisson distribution. The derivation allows one to construct 
a useful model of hierarchical clustering from Poisson initial 
conditions. The resulting model is useful because the Pois- 
son assumption allows one to solve many problems that, at 
present, have no solution if more realistic initial conditions 
are used. 

The model is a simple generalization of the excursion 
set model developed by Bond et al. (1991). Their approach 
allows one to estimate the mass function of collapsed ha- 
los; the generalization presented here allows one to describe 
the spatial distribution of these halos as well. The model 
can also be thought of as a simple variant of the spher- 
ical collapse model. In the model, initially denser regions 
contract more rapidly than less dense regions, sufficiently 
underdense regions expand, the influence of external tides 
on the evolution of such regions is ignored, and the num- 
ber of expanding and contracting regions is assumed to be 
conserved. Strictly speaking, none of these assumptions can 
be justified physically. However, these simplifications mean 
that the model can be worked out relatively easily. Moreover, 
the Generalized Poisson distribution, derived after making 
these assumptions, is a reasonably accurate fit to the
Eulerian counts-in-cells distribution measured in numerical
simulations of clustering from Poisson initial conditions. This
suggests that, at least for estimating the evolution of the 
counts-in-cells statistic from such initial conditions, these 
simplifications are justified. 

In the model, a collapsed halo occupies a vanishingly 
small volume. In the simulations, collapsed halos have non- 
zero sizes — any given halo virializes at some fraction, typi- 
cally about one half, of its turnaround radius. This means 
that on scales smaller than that of a typical halo, the counts- 
in-cells distribution computed here will cease to be a good 
approximation to that measured in the simulations. As dis- 
cussed in the introduction, the fact that halos have non- 
trivial density profiles means that the b parameter in equa- 
tion (^) depends on scale. A reasonable approximation to 
the effects of this scale dependence can be computed from 
models, such as those proposed by Navarro, Frenk & White 
(1996), of the density profiles of collapsed halos (see Sheth 
& Saslaw 1994 for details). As Poisson initial conditions are 
not realistic anyway, this seems an unnecessary refinement 
to an already idealized model. 

As the basic model has worked out so easily, as it al- 
lows one to estimate the extent to which halos are biased 
tracers of the mass, and, most importantly, as it provides 
a reasonably accurate description of the evolution of clus- 
tering measured in numerical simulations, it seems worth 
extending it to describe clustering from more general initial 
conditions. This extension is in progress. 
APPENDIX A: THE SPHERICAL COLLAPSE MODEL



Consider a region of initial, comoving Lagrangian size $R_0$.
Let $\delta_0$ denote the extrapolated linear overdensity of this
object. In units where the average comoving density is unity,
there is a deterministic relation between the mass $M_0$ within
$R_0$ and $R_0$ itself: $M_0 \propto R_0^3$ provided $\delta_0 \ll 1$. As the
Universe evolves, the size of this region changes. Let R denote
the size of the region at some later time. Then the density
within the region is simply $(R_0/R)^3 = (1 + \delta)$. In the
spherical collapse model there is a deterministic relation
between the initial Lagrangian size $R_0$ and density of an
object, and its Eulerian size R at any subsequent time. For an
Einstein-de Sitter universe,



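$$\frac{R_p(z)}{R_0} = \frac{3}{10}\,\frac{(1-\cos\theta)}{|\delta_0|}, \qquad \frac{1}{1+z} = \frac{3\times 6^{2/3}}{20}\,\frac{(\theta-\sin\theta)^{2/3}}{|\delta_0|} \tag{A1}$$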



(e.g. Peebles 1980). If $\delta_0 < 0$, $(1 - \cos\theta)$ should be
replaced with $(\cosh\theta - 1)$ and $(\theta - \sin\theta)$ with
$(\sinh\theta - \theta)$. In this model, collapsing objects reach
turnaround at



$$(1 + z_{ta}) = 4^{1/3}\,\frac{\delta_0}{\delta_{c0}}, \tag{A2}$$

at which time

$$\frac{R(z_{ta})}{R_0} = \frac{(1 + z_{ta})\,R_p(z_{ta})}{R_0} = \frac{6}{10}\,\frac{4^{1/3}}{\delta_{c0}}. \tag{A3}$$



For simplicity, consider the epoch when z = 0. Since
$(1 + \delta) = (R_0/R)^3$, this means that there is a relation
between $\delta_0$ and $(1 + \delta)$ that is independent of R. Mo &
White (1996) give the following approximation to this relation:



$$\delta_0 = 1.68647 - 1.35\,(1+\delta)^{-2/3} - 1.12431\,(1+\delta)^{-1/2} + 0.78785\,(1+\delta)^{-0.58661}. \tag{A4}$$



These relations imply that to every pair (R, z) there is an
associated curve in the $(\delta_0, R_0)$ plane, so there is a
corresponding curve in the $(\delta_0, V_0)$ plane. For a given
Eulerian scale R, and a specified epoch z, equation (A4) gives
what is effectively the boundary $\delta_{sc}(V_0|R)$ associated with
the spherical collapse model. This barrier should be compared
with that given by equation (pa).
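
A short numerical check of the fitting formula may be useful. The sketch below assumes that equation (A4) is the Mo & White (1996) fit with exponents -2/3, -1/2 and -0.58661, and compares it with the exact parametric solution at one point where that solution is available in closed form: an object that reaches turnaround ($\theta = \pi$) at z = 0. There, $\delta_0 = (3/20)(6\pi)^{2/3}$ and $1 + \delta = 9\pi^2/16$. The function name is hypothetical.

```python
import math

def mo_white_delta0(delta):
    """Assumed form of the fit (A4): linear delta_0 for nonlinear delta at z = 0."""
    x = 1.0 + delta
    return (1.68647 - 1.35 * x ** (-2.0 / 3.0)
            - 1.12431 * x ** (-0.5) + 0.78785 * x ** (-0.58661))

# Exact parametric values at turnaround (theta = pi) for an object seen at z = 0.
delta0_turnaround = (3.0 / 20.0) * (6.0 * math.pi) ** (2.0 / 3.0)
one_plus_delta_turnaround = 9.0 * math.pi ** 2 / 16.0

fit_at_turnaround = mo_white_delta0(one_plus_delta_turnaround - 1.0)
fit_at_mean_density = mo_white_delta0(0.0)
print(delta0_turnaround, fit_at_turnaround, fit_at_mean_density)
```

The fit vanishes (to the quoted digits) at the mean density, approaches 1.68647 as the nonlinear density grows without bound, and lies within about half a per cent of the exact turnaround value.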